DETAILED ACTION

1. Claims 1-29 have been examined.

Papers Submitted 

2. It is hereby acknowledged that the following papers have been received and placed of 
record in the file: Amendment as received on 8/25/2005. 

Claim Objections 

3. Claim 1 is objected to because of the following informalities: In line 8, replace 
"instruction" with —instructions—. Appropriate correction is required. 

4. Claim 17 is objected to because of the following informalities: Please replace the period
at the end of the 2nd to last paragraph with more appropriate punctuation. Appropriate correction
is required.

5. Claim 22 is objected to because of the following informalities: In line 10, the phrase
"wherein some of said instruction control units each comprising an..." is not proper grammar. In
addition, in line 15, the phrase "both from said single instruction memory" is hard to understand
when read with the rest of the claim as a whole. The examiner believes that applicant is trying to
claim that both the first and second series are from the single instruction memory, but this should
be made more clear. Appropriate correction is required.

6. Claim 27 is objected to because of the following informalities: Please replace "having" 
with —have— (or an equivalent) in line 9. Appropriate correction is required. 
Claim Rejections - 35 USC § 112

7. The following is a quotation of the second paragraph of 35 U.S.C. 112:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the 
subject matter which the applicant regards as his invention. 

8. Claims 24-26 are rejected under 35 U.S.C. 112, second paragraph, as being indefinite for
failing to particularly point out and distinctly claim the subject matter which applicant regards as 
the invention. 

9. Claims 24-26 recite the limitation "the single series of instructions" in the last paragraph. 
There is insufficient antecedent basis for this limitation in the claim. It will be interpreted as just 
instructions. 

Claim Rejections - 35 USC §102 

10. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the 
basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by another filed 
in the United States before the invention by the applicant for patent or (2) a patent granted on an application for 
patent by another filed in the United States before the invention by the applicant for patent, except that an 
international application filed under the treaty defined in section 351(a) shall have the effects for purposes of this 
subsection of an application filed in the United States only if the international application designated the United 
States and was published under Article 21(2) of such treaty in the English language. 

11. Claims 17 and 25 are rejected under 35 U.S.C. 102(e) as being anticipated by Fernando et 
al., U.S. Patent No. 6,272,616 (as applied in the previous Office Action and herein referred to as 
Fernando). 

12. Referring to claim 17, Fernando has taught a processor control apparatus comprising: 
a) a single instruction memory for storing a plurality of series of instructions to be executed by a
plurality of arithmetic units. See Fig. 1, components 12 and 26. Fig. 7 shows multiple series of 
instructions being executed by different units. 

b) an instruction decoder for decoding a series of instructions from said instruction memory, and 
outputting a decoded result to any of said plurality of arithmetic units. See Fig. 1, component 22a 
and 22b. 

c) a selector selectively switching between a plurality of series of instructions from said
instruction memory to be decoded by said instruction decoder, and supplying a series of
instructions thus selected to said instruction decoder. See Fig. 1, component 21, and claim 44 of
Fernando.

d) wherein said instruction memory has a plurality of ports for issuing said series of instructions 
to said instruction decoder. See Fig. 1 component 12 and note that multiple ports are connected 
to multiple fetch units and ultimately to multiple decoders. 

e) wherein said plurality of series of instructions are issued to said plurality of arithmetic units 
from the instruction memory to be simultaneously and independently driven both upon the 
instructions including different commands for each of the plurality of instruction control units 
and including a same command for each of the plurality of instruction control units. Looking at 
Fig. 1, it can be seen that when multiplexer 21 selects instructions from fetcher 20a, then the units 
26 in components 24a and 24b will be simultaneously driven (by a first instruction stream). This 
is shown more specifically in Fig. 3. Note that this is the purpose of SIMD instructions: to have
multiple units execute the same instruction with multiple data items. On the other hand, looking
at Fig. 1, it can be seen that when multiplexer 21 selects instructions from fetcher 20b, then the
units 26 in components 24a and 24b will be independently driven (the units in 24a will be driven
by a first instruction stream from fetcher 20a while units in 24b will be driven by a second
instruction stream from fetcher 20b). Note that this is the purpose of MIMD instructions: to have
multiple units execute multiple instructions with multiple data items.
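
For illustration only, the two selections attributed to multiplexer 21 can be sketched as a small Python model. The function name, the boolean flag, and the example opcodes are hypothetical and are not taken from Fernando; the sketch merely restates the reading given above, in which selecting fetcher 20a drives both groups of units with one stream and selecting fetcher 20b lets the groups run different streams.

```python
def drive_units(select_fetcher_a, stream_a, stream_b):
    """Return the instruction stream seen by each group of units 26.

    Hypothetical model of the selection described for multiplexer 21:
      select_fetcher_a=True  -> 24a and 24b both follow fetcher 20a (SIMD-like)
      select_fetcher_a=False -> 24a follows fetcher 20a, 24b follows 20b (MIMD-like)
    """
    group_24a = list(stream_a)
    group_24b = list(stream_a) if select_fetcher_a else list(stream_b)
    return {"24a": group_24a, "24b": group_24b}


# Made-up opcodes, used only to show the two cases.
simd = drive_units(True, ["add", "mul"], ["load", "store"])
mimd = drive_units(False, ["add", "mul"], ["load", "store"])
```

In the first call both groups receive the same stream; in the second each group receives its own.
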
13. Referring to claim 25, Fernando has taught a processor comprising: 

a) a plurality of arithmetic units. See Fig. 1, components 24 and 26. 

b) a single instruction memory for storing a plurality of series of instructions to be executed by a
plurality of arithmetic units. See Fig. 1, components 12 and 26. Fig. 7 shows multiple series of
instructions being executed by different units.

c) an instruction decoder for decoding a series of instructions from said instruction memory, and 
outputting a decoded result to any of said plurality of arithmetic units. See Fig. 1, component 22a 
and 22b. 

d) a selector for selectively switching between a plurality of series of instructions from said 
instruction memory to be decoded by said instruction decoder, and supplying a series of 
instructions thus selected to said instruction decoder. See Fig. 1, component 21, and claim 44 of 
Fernando. 

e) wherein said instruction memory has a plurality of ports for issuing said series of instructions 
to said instruction decoder. See Fig. 1 component 12 and note that multiple ports are connected 
to multiple fetch units and ultimately to multiple decoders. 

f) wherein said plurality of series of instructions are issued to said plurality of arithmetic units 
from the instruction memory to be simultaneously and independently driven. Looking at Fig. 1, it 
can be seen that when multiplexer 21 selects instructions from fetcher 20a, then the units 26 in
components 24a and 24b will be simultaneously driven (by a first instruction stream). This is
shown more specifically in Fig. 3. Note that this is the purpose of SIMD instructions: to have
multiple units execute the same instruction with multiple data items. On the other hand, looking
at Fig. 1, it can be seen that when multiplexer 21 selects instructions from fetcher 20b, then the
units 26 in components 24a and 24b will be independently driven (the units in 24a will be driven
by a first instruction stream from fetcher 20a while units in 24b will be driven by a second
instruction stream from fetcher 20b). Note that this is the purpose of MIMD instructions: to have
multiple units execute multiple instructions with multiple data items.

g) the plurality of arithmetic units execute the single series of instructions concurrently upon the 
single series of instructions including different commands for each of the plurality of instruction 
control units. See Fig. 4 and note that when the processor is in MIMD mode, a first series of
instructions is fetched by unit 20a and sent to arithmetic units 26 in component 24a, and a
completely different series of instructions is fetched by unit 20b and sent to arithmetic units 26 in
component 24b. See column 3, lines 62-67.

h) the plurality of arithmetic units execute the single series of instructions concurrently upon the 
single series of instructions including a same command for each of the plurality of instruction 
control units. See Fig. 3 and note that when the processor is in SIMD mode, a single series of
instructions is fetched by unit 20a and the series is sent to multiple arithmetic units (components
26 in components 24a and 24b). With SIMD, the instructions are the same. See column 3, lines
47-61, and column 11, line 60, to column 12, line 4.
Claim Rejections - 35 USC §103

14. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

15. Claims 1-3, 5-9, 23, and 28-29 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Fernando et al., U.S. Patent No. 6,272,616, as applied above, in view of Mohamed, U.S. 
Patent No. 6,366,998. 

16. Referring to claim 1, Fernando has taught a processor control apparatus for controlling a 
plurality of arithmetic units (Fig. 4, components 26), said processor control apparatus comprising: 

a) a plurality of instruction control units issuing a series of instructions to said plurality of 
arithmetic units. See Fig. 6 and note that multiple units fetch series of instructions and pass them 
on to arithmetic units. 

b) wherein at least one of said instruction control units switches between a first execution 
process driving said plurality of arithmetic units by a single series of instructions issued from a 
single one of the plurality of instruction control units and the plurality of arithmetic units execute 
the single series of instructions concurrently upon the single series of instructions including a 
same command for each of the plurality of instruction control units. See Fig. 3 and note that 
when the processor is in SIMD mode, a single series of instructions is fetched by unit 20a and 
the series is sent to multiple arithmetic units (components 26 in components 24a and 24b). With 
SIMD, the instructions are the same. See column 3, lines 47-61, and column 11, line 60, to 
column 12, line 4. 
c) a second execution process correspondingly driving said plurality of arithmetic units by a
plurality of different series of instructions issued respectively from more than one of said 
plurality of instruction control units, and in which the processing by the plurality of arithmetic 
units can be synchronized. See Fig. 4 and note that when the processor is in MIMD mode, a first 
series of instructions is fetched by unit 20a and sent to arithmetic units 26 in component 24a, and 
a completely different series of instructions is fetched by unit 20b and sent to arithmetic units 26 
in component 24b. See column 3, lines 62-67. Furthermore, it should be realized that all 
arithmetic units are synchronized at least to some degree because everything is based off of clock 
signals in the system. For instance, looking at Fig. 7, it can be seen that multiple data paths 
(which include arithmetic units) operate in unison over time. Therefore, they are synchronized. 

d) Fernando has not taught that the plurality of arithmetic units execute the single series of 
instructions concurrently upon the single series of instructions including different commands for 
each of the plurality of instruction control units. However, Mohamed has taught a system in 
which a fetcher/scheduler may issue either SIMD instructions (same commands) or VLIW 
instructions (different commands) to groups of arithmetic units. See column 3, lines 51-65, 
column 4, lines 39-42, and column 5, lines 9-17. A person of ordinary skill in the art would have 
recognized that since Fernando has taught a system with multiple functional units, Fernando is 
capable of executing at least SIMD and VLIW instructions. By modifying Fernando to also 
issue different commands (VLIW) to arithmetic units, the system is able to take advantage of 
horizontal programming while also minimizing hazards. See column 1, lines 21-25, and 40-41. 
As a result, it would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify Fernando to allow the fetcher 20a to issue VLIWs with different commands 
to arithmetic units 26, in addition to SIMDs with same commands. One would be motivated to
make such a combination because hazards are reduced and horizontal programming is 
advantageous. In addition, it allows for more flexibility. That is, instead of fetcher 20a only 
being limited to issuing same commands to units 26, different commands would be issued as 
well. 

17. Referring to claim 2, Fernando in view of Mohamed has taught a processor control 
apparatus as described in claim 1. Fernando has further taught that said at least one instruction 
control unit each perform a switching process for switching between said first execution process 
and said second execution process according to information which is contained in advance in a 
series of instructions. Note that the multiplexer 21 (Fig. 1) is what causes a switch in instruction 
stream execution. This switch is caused by a change in signal 32i, which is in response to an 
instruction (CFORK or DFORK) contained in advance in a first instruction stream. See column 
5, lines 8-16. Consequently, the system would go from the first process (SIMD/VLIW mode) to 
the second process (MIMD mode) and vice-versa. 
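
As a rough sketch only, switching driven by information carried in the instruction stream can be modeled as below. The mapping of CFORK to the single-stream mode and DFORK to the multiple-stream mode follows the characterization given in this action's discussion of claim 29; the function name, the assumed starting mode, and the other opcodes are hypothetical and are not drawn from Fernando.

```python
def trace_modes(stream, initial_mode="SIMD"):
    """Record the execution mode in effect after each instruction is seen.

    Assumptions (for illustration): the stream is a list of opcode strings,
    CFORK selects the single-stream mode and DFORK the multiple-stream mode,
    and the processor starts in initial_mode.
    """
    mode = initial_mode
    trace = []
    for op in stream:
        if op == "CFORK":
            mode = "SIMD"
        elif op == "DFORK":
            mode = "MIMD"
        trace.append((op, mode))
    return trace


modes = trace_modes(["add", "DFORK", "mul", "CFORK", "sub"])
```

The point of the sketch is that the switch is decided by the contents of the stream itself rather than by an external control input.
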

18. Referring to claim 3, Fernando in view of Mohamed has taught a processor control 
apparatus as described in claim 1. Fernando in view of Mohamed has not explicitly taught that 
when an M-th one of said instruction control units issues a second series of instructions to an N- 
th one of said arithmetic units which is performing said second execution process based on a first 
series of instructions issued by an N-th one of said instruction control units different from said 
M-th instruction control unit, said M-th instruction control unit is set in a wait state until said N- 
th arithmetic unit completes said second execution process. However, Official Notice is taken 
that structural hazards and stalling in response to hazards is well known in the art. More 
specifically, if a functional unit is already executing instructions received from a first fetch unit
and another fetch unit wants to send new instructions to the same functional unit, then if the 
previous instructions are still executing, the new instructions must be stalled. As a result, it 
would have been obvious to one of ordinary skill in the art at the time of the invention to modify 
Fernando in view of Mohamed to stall an Mth control unit that wants to issue instructions to an 
arithmetic unit that is already executing instructions sent by an Nth control unit. 
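
The stalling behavior relied upon above can be illustrated with a minimal sketch. The class, its fields, and the cycle counts are hypothetical and describe no particular reference; the sketch only shows a second control unit being held in a wait state while an arithmetic unit is still busy with work issued by a different control unit.

```python
class ArithmeticUnit:
    """Hypothetical arithmetic unit that accepts work from one control unit at a time."""

    def __init__(self):
        self.owner = None      # control unit whose instructions are executing
        self.remaining = 0     # cycles of work still outstanding

    def try_issue(self, control_unit, cycles):
        """Accept the work and return True, or return False to signal a stall."""
        if self.remaining > 0 and self.owner != control_unit:
            return False       # structural hazard: the requester must wait
        self.owner = control_unit
        self.remaining += cycles
        return True

    def tick(self):
        """Advance one clock cycle."""
        if self.remaining > 0:
            self.remaining -= 1


unit = ArithmeticUnit()
accepted_n = unit.try_issue("N", 2)        # N-th control unit issues first
stalled_m = not unit.try_issue("M", 1)     # M-th control unit must wait
unit.tick()
still_stalled = not unit.try_issue("M", 1)
unit.tick()
accepted_m = unit.try_issue("M", 1)        # unit is free, so M may now issue
```
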

19. Referring to claim 5, Fernando in view of Mohamed has taught a processor control 
apparatus as described in claim 1. Fernando has further taught a second storage element which
operates to hold, when one of said arithmetic units executing a first series of instructions from 
one of said instruction control units is switched to execute a second series of instructions from 
another instruction control unit, data generated by the second series of instructions under 
execution by associating the data with that instruction control unit which is executing the second 
series of instructions. See Fig. 4, component 25b. 

20. Referring to claim 6, Fernando in view of Mohamed has taught a processor control 
apparatus as described in claim 1. Fernando has further taught that it is determined, based on an 
instruction executing state of each arithmetic unit, one of said arithmetic units to which a new 
series of instructions is to be issued by one of said instruction control units, and wherein said one 
instruction control unit is controlled based on the result of the determination so that the new 
series of instructions are directed to said one arithmetic unit thus determined. See Fig. 7 and note
that when it is determined that a particular arithmetic unit executes a fork instruction, a new 
series is to be sent to another arithmetic unit for execution. 
21. Referring to claim 7, Fernando in view of Mohamed has taught a processor control
apparatus as described in claim 1. Fernando in view of Mohamed has further taught that each of 
said series of instructions includes a VLIW type instruction. It should be noted that Fernando 
and Mohamed teach VLIW, SIMD, and MIMD. All are considered a form of VLIW. 

22. Referring to claim 8, Fernando in view of Mohamed has taught a processor control 
apparatus as described in claim 1. Fernando has further taught that each of said series of
instructions includes a series of time-sharing instructions for serially driving a plurality of ones
of said arithmetic units. For instance, looking at Fig. 7 of Fernando, the series of instructions
corresponding to thread 2 execute from time 2 to time 7. Therefore, the thread takes 6 time units 
to execute and each individual instruction requires some portion of the overall 6 time units for 
execution. Therefore, each instruction is a time-sharing instruction, i.e., each instruction shares 
the overall 6 time units with the other instructions. 

23. Referring to claim 9, Fernando in view of Mohamed has taught a processor control 
apparatus as described in claim 1. Fernando has further taught power control elements for
controlling power supply to said arithmetic units based on their instruction executing states. See 
the abstract and column 8, lines 10-12, and note that unused processing elements are deactivated 
in order to conserve power. 
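
Purely as an illustration of power control keyed to instruction executing state, the following sketch switches off any unit that is not executing. The function name, the unit labels, and the "on"/"off" values are hypothetical and are not taken from Fernando.

```python
def power_states(executing):
    """Map each unit to "on" if it is executing instructions, otherwise "off".

    `executing` is a hypothetical dict of unit label -> bool (is the unit busy?).
    """
    return {unit: ("on" if busy else "off") for unit, busy in executing.items()}


states = power_states({"unit_0": True, "unit_1": False})
```
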

24. Referring to claim 23, Fernando has taught a processor comprising: 

a) a plurality of arithmetic units. See Fig. 1, components 24a, 24b, and 26. 

b) a plurality of instruction control units for issuing a series of instructions to drive said 
arithmetic units in a controlled manner. See Fig. 1 and note a first instruction control unit would 
comprise at least components 20a, 22a, and 12, whereas a second unit would comprise at least
components 20b, 22b, and 12. Each of these units issues instructions to the arithmetic units. 

c) wherein some of said instruction control units are operable to switch between a first execution 
process for synchronously driving said plurality of arithmetic units by a single series of 
instructions upon the instructions including a same command for each of the plurality of 
instruction control units. See Fig. 3 and note that when the processor is in SIMD mode, a single 
series of instructions is fetched by unit 20a and the series is sent to multiple arithmetic units 
(components 26 in components 24a and 24b). With SIMD, the instructions are the same. See 
column 3, lines 47-61, and column 11, line 60, to column 12, line 4. 

d) a second execution process independently driving said plurality of arithmetic units by a 
plurality of different series of instructions, respectively. See Fig. 4 and note that when the
processor is in MIMD mode, a first series of instructions is fetched by unit 20a and sent to 
arithmetic units 26 in component 24a, and a completely different series of instructions is fetched 
by unit 20b and sent to arithmetic units 26 in component 24b. See column 3, lines 62-67. 

e) Fernando has not taught synchronously driving the plurality of arithmetic units by a single 
series of instructions upon the instructions including different commands for each of the plurality 
of instruction control units. However, Mohamed has taught a system in which a 
fetcher/scheduler may issue either SIMD instructions (same commands) or VLIW instructions 
(different commands) to groups of arithmetic units. See column 3, lines 51-65, column 4, lines 
39-42, and column 5, lines 9-17. A person of ordinary skill in the art would have recognized that 
since Fernando has taught a system with multiple functional units, Fernando is capable of 
executing at least SIMD and VLIW instructions. By modifying Fernando to also issue different 
commands (VLIW) to arithmetic units, the system is able to take advantage of horizontal
programming while also minimizing hazards. See column 1, lines 21-25, and 40-41. As a result, 
it would have been obvious to one of ordinary skill in the art at the time of the invention to 
modify Fernando to allow the fetcher 20a to issue VLIWs with different commands to arithmetic 
units 26, in addition to SIMDs with same commands. One would be motivated to make such a 
combination because hazards are reduced and horizontal programming is advantageous. In 
addition, it allows for more flexibility. That is, instead of fetcher 20a only being limited to 
issuing same commands to units 26, different commands would be issued as well. 
25. Referring to claim 28, Fernando has taught a processor controlling method usable with a 
plurality of instruction control units for controlling a plurality of arithmetic units to execute a 
plurality of series of instructions, said method comprising: 

a) prescribing, in advance in a series of instructions which is to be performed, synchronous 
execution in which a plurality of predetermined ones of said arithmetic units are synchronously 
driven by a single series of instructions, or independent execution in which the plurality of 
predetermined arithmetic units are independently driven by a plurality of respective series of 
instructions. A single series of instructions may be used to drive a plurality of arithmetic units. 
This would occur when the selector 21 shown in Fig. 1 selects the instructions from bus 36. This 
corresponds to Fig. 7 at times 10-12 when a single series of instructions (thread 1) is executed by
two different data paths (arithmetic units), and hence results in the "combined control" illustrated
in the figure. On the other hand, independent execution of a plurality of series of instructions by
a plurality of arithmetic units is also shown in Fig. 7. For instance, two series (corresponding to
thread 1 and 2) are independently executed by data paths 1 and 2 (arithmetic units). This would
occur when MUX 21 in Fig. 1 selects the group of instructions from bus 35. 

b) switching between the synchronous driving upon the instructions including a same command 
for each of the plurality of instruction control units and the independent driving of the 
predetermined arithmetic units for performing a series of instructions based on the contents of 
the prescription therein. Clearly from the examples given in Fig. 7, synchronous execution of a 
single series or independent execution of multiple series of instructions is switched back and 
forth. See Fig. 7, and notice data paths 1 and 2 from time 1-12. Again, looking at Fig. 1, it can be
seen that when multiplexer 21 selects instructions from fetcher 20a, then the units 26 in
components 24a and 24b will be simultaneously driven (by a first instruction stream). This is 
shown more specifically in Fig. 3. Note that this is the purpose of SIMD instructions: to have 
multiple units execute the same instruction with multiple data items. On the other hand, looking 
at Fig. 1, it can be seen that when multiplexer 21 selects instructions from fetcher 20b, then the 
units 26 in components 24a and 24b will be independently driven (the units in 24a will be driven 
by a first instruction stream from fetcher 20a while units in 24b will be driven by a second 
instruction stream from fetcher 20b). Note that this is the purpose of MIMD instructions: to have 
multiple units execute the multiple instructions with multiple data items. 

c) Fernando has not taught synchronous driving upon the instructions including different 
commands for each of the plurality of instruction control units. However, Mohamed has taught a 
system in which a fetcher/scheduler may issue either SIMD instructions (same commands) or 
VLIW instructions (different commands) to groups of arithmetic units. See column 3, lines 51- 
65, column 4, lines 39-42, and column 5, lines 9-17. A person of ordinary skill in the art would 



Application/Control Number: 09/855,776 Page 15 

Art Unit: 2183 

have recognized that since Fernando has taught a system with multiple functional units, 
Fernando is capable of executing at least SIMD and VLIW instructions. By modifying Fernando
to also issue different commands (VLIW) to arithmetic units, the system is able to take
advantage of horizontal programming while also minimizing hazards. See column 1, lines 21-25,
and 40-41. As a result, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify Fernando to allow the fetcher 20a to issue VLIWs with different 
commands to arithmetic units 26, in addition to SIMDs with same commands. One would be 
motivated to make such a combination because hazards are reduced and horizontal programming 
is advantageous. In addition, it allows for more flexibility. That is, instead of fetcher 20a only 
being limited to issuing same commands to units 26, different commands would be issued as 
well. 

26. Referring to claim 29, Fernando has taught a processor control method to control a 
plurality of arithmetic units connected with a plurality of instruction control units, comprising: 
a) switching between simultaneously driving the plurality of arithmetic units by issuing a single 
series of instructions from one of the plurality of instruction control units upon the instructions 
including a same command for each of the plurality of instruction control units and 
independently driving each of the plurality of arithmetic units by correspondingly issuing series 
of instructions from the plurality of instruction control units, wherein the switching is performed 
based on contents of processes to be executed. See Fig. 1, component 21 and note that either a 
first series of instructions along bus 36 may be selected for decoding or a second series of 
instructions from bus 35 may be selected for decoding. When the first series of instructions 
along bus 36 is selected, the system is in SIMD mode (Fig. 3), where the plurality of arithmetic 
units (components 26 in component 24b) are driven at the same time as a second group of
arithmetic units (components 26 in component 24a) by the same stream of instructions. When 
the second series of instructions along bus 35 is selected, the system is in MIMD mode (Fig. 4), 
where each of the plurality of arithmetic units (components 26 in component 24b) are driven 
independently of a second group of arithmetic units (components 26 in component 24a) by 
different streams of instructions. Also, from Fig. 5, it should be noted that the switching between 
modes is done based on instructions (contents) of the processes. For instance, when a cfork 
instruction is encountered, SIMD mode is entered where the arithmetic units are simultaneously 
driven by a single stream whereas when a dfork is encountered, the arithmetic units are 
independently driven by multiple instruction streams. 

b) Fernando has not taught synchronous driving upon the instructions including different 
commands for each of the plurality of instruction control units. However, Mohamed has taught a 
system in which a fetcher/scheduler may issue either SIMD instructions (same commands) or 
VLIW instructions (different commands) to groups of arithmetic units. See column 3, lines 51- 
65, column 4, lines 39-42, and column 5, lines 9-17. A person of ordinary skill in the art would 
have recognized that since Fernando has taught a system with multiple functional units, 
Fernando is capable of executing at least SIMD and VLIW instructions. By modifying Fernando 
to also issue different commands (VLIW) to arithmetic units, the system is able to take 
advantage of horizontal programming while also minimizing hazards. See column 1, lines 21- 
25, and 40-41. As a result, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify Fernando to allow the fetcher 20a to issue VLIWs with different 
commands to arithmetic units 26, in addition to SIMDs with same commands. One would be 
motivated to make such a combination because hazards are reduced and horizontal programming
is advantageous. In addition, it allows for more flexibility. That is, instead of fetcher 20a only 
being limited to issuing same commands to units 26, different commands would be issued as 
well. 

27. Claim 4 is rejected under 35 U.S.C. 103(a) as being unpatentable over Fernando in view 
of Mohamed, as applied above, and further in view of Parady, U.S. Patent No. 5,933,627 (as 
applied in the previous Office Action). 

28. Referring to claim 4, Fernando in view of Mohamed has taught a processor control 
apparatus as described in claim 1. Fernando in view of Mohamed has not taught a first storage 
element for holding a plurality of series of instructions, wherein when an M-th one of said 
instruction control units issues a second series of instructions to an N-th one of said arithmetic 
units which is performing said second execution process based on a first series of instructions 
issued by an N-th one of said instruction control units different from said M-th instruction 
control unit, said second series of instructions from said M-th instruction control unit are stored 
in said first storage element, and wherein said N-th arithmetic unit executes instructions which 
are stored in said first storage element based on information contained in said first series of 
instructions issued by said N-th instruction control unit. However, Parady has taught such a 
concept. See Fig. 3, for instance, and note the instruction control unit (say 104 in combination 
with 28), comprises a first storage element to hold thread instructions. If the thread 0 control 
unit (102 in combination with 28) issues a load that misses the cache or a jump to thread 
instruction, and the thread 1 control unit is the next to take control, then an Nth arithmetic unit 
will execute the instructions of thread 1 based on information contained in advance in the first
series of instructions (i.e., the load or jump). By stalling the instructions in buffers while they're 
waiting, they are available to the system when they are needed. That is, the processor will not 
have to make a time-expensive external memory fetch to get the instructions since they reside in 
buffers. Consequently, it would have been obvious to one of ordinary skill in the art at the time 
of the invention to modify Fernando in view of Mohamed to include the storage element of 
Parady for holding instructions waiting to be executed based on another series of instructions. 

29. Claims 10-12, 14-16, and 24 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Fernando, as applied above. 

30. Referring to claim 10, Fernando has taught a processor control apparatus comprising: 

a) a single instruction memory for storing a plurality of series of instructions and supplying them 
to arithmetic units. See Fig. 1, components 12 and 26. Fernando has not taught a plurality of 
instruction memories for storing a plurality of series of instructions to be executed by a plurality 
of arithmetic units. However, Official Notice is taken that having a plurality of memories for 
storing independent series of instructions is well known and accepted in the art. In addition, as 
shown in Nerwin v. Erlichman, 168 USPQ 177 (1969), to make separable is generally not given
patentable weight or would have been an obvious improvement. For instance, a person of 
ordinary skill in the art would have recognized that by implementing a plurality of memories 
instead of a single memory with concurrent access capabilities, as taught by Fernando (evident in 
Fig. 7 where multiple streams are fetched at once), then the circuitry required to allow for
concurrent access would be eliminated, thereby reducing the complexity and cost of the system.
Consequently, it would have been obvious to replace Fernando's single instruction memory with
a plurality of instruction memories. 

b) an instruction decoder for decoding a series of instructions from said instruction memories, 
and outputting a decoded result to any of said plurality of arithmetic units. See Fig. 1, components
22a, 22b, and 26.

c) and a selector for selectively switching between a plurality of series of instructions from said 
instruction memories to be decoded by said instruction decoder, and supplying a series of 
instructions thus selected to said instruction decoder. See Fig. 1, component 21, and claim 44 of 
Fernando. 

d) wherein said plurality of series of instructions are issued to said plurality of arithmetic units 
from both a single one of the plurality of instruction memories and issued respectively from 
more than one of said plurality of instruction memories to enable said plurality of arithmetic 
units to be simultaneously and independently driven. See Fig. 6 and note that instructions are
received by multiple units from a single memory (via fetcher 20a) and also, each data path would 
receive its own series from its own memory and fetcher (20b). 

31. Referring to claim 11, Fernando has taught a processor control apparatus as described in
claim 10. Fernando has further taught that some of said plurality of series of instructions contain 
information about selective switching between said series of instructions to be performed by said 
selector, and wherein said instruction decoder decodes said information contained in a series of 
instructions, and outputs a switching instruction to said selector. See column 6, lines 41-54, and 
column 7, lines 17-29, and note that when a CFORK or DFORK is detected by decoder 22a, a
signal 32i (Fig. 1) is generated, which then causes the multiplexer 21 (Fig. 1) to select another
series of instructions, which will then be decoded. 
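
The switching behavior relied upon above can be sketched as follows. This is a minimal illustrative model and not Fernando's circuitry: the names `Mux` and `decode` are assumptions made for the sketch, and the mapping of CFORK to bus 36 and DFORK to bus 35 follows the characterization of Figs. 3-5 given elsewhere in this action.

```python
# Illustrative model only; class and function names are hypothetical.
BUS_36 = "bus 36"  # shared stream (SIMD mode, as discussed for Fig. 3)
BUS_35 = "bus 35"  # independent stream (MIMD mode, as discussed for Fig. 4)


class Mux:
    """Stands in for multiplexer 21: forwards one of two instruction streams."""

    def __init__(self):
        self.selected = BUS_36

    def select(self, bus):
        self.selected = bus


def decode(opcode, mux):
    """Stands in for decoder 22a: a fork opcode produces a switching signal."""
    if opcode == "CFORK":
        mux.select(BUS_36)
    elif opcode == "DFORK":
        mux.select(BUS_35)
    # Any other opcode leaves the current selection unchanged.
    return mux.selected


mux = Mux()
trace = [decode(op, mux) for op in ["ADD", "DFORK", "MUL", "CFORK"]]
```

In this sketch the selection changes only when a fork opcode is decoded, which mirrors the claimed decoder outputting a switching instruction to the selector.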

32. Referring to claim 12, Fernando has taught a processor control apparatus as described in 
claim 10. Fernando has further taught that some of said plurality of series of instructions contain 
a synchronizing instruction for allowing a first predetermined one of said arithmetic units and a 
second predetermined arithmetic unit to synchronously perform processes, and wherein when 
said synchronizing instruction is issued to said first predetermined arithmetic unit, said first 
predetermined arithmetic unit is set in a wait state, and an instruction decoder of said second 
predetermined arithmetic unit does not output a switching instruction to its associated selector if 
a process is being executed by said second predetermined arithmetic unit upon issuance of said 
synchronizing instruction, and does not release the wait state of said first predetermined 
arithmetic unit until said second predetermined arithmetic unit completes said process. See 
column 7, lines 52-67. Note that a WAIT instruction will pause the first arithmetic unit until a 
DJOIN instruction is executed by the second arithmetic unit (synchronization). Furthermore, the 
stream will not be switched as the second arithmetic unit will continue executing the same thread 
until it is finished (upon a DJOIN instruction). See Fig. 7. 
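
The synchronization described above can be sketched as a small lockstep simulation. It is a simplified model under stated assumptions, not Fernando's implementation: the function name `simulate`, the one-instruction-per-cycle timing, and the example opcodes other than WAIT and DJOIN are all hypothetical.

```python
def simulate(first, second):
    """Return a per-cycle log of (first-unit action, second-unit action).

    Assumes the second stream contains a DJOIN whenever the first stream
    contains a WAIT; otherwise the first unit would stall forever.
    """
    log = []
    i = j = 0
    joined = False
    while i < len(first) or j < len(second):
        # The second unit keeps executing its own thread and never switches streams.
        if j < len(second):
            b = second[j]
            j += 1
        else:
            b = "idle"
        # The first unit stalls at WAIT until the second unit has executed DJOIN.
        if i < len(first) and first[i] == "WAIT" and not joined:
            a = "stall"
        elif i < len(first):
            a = first[i]
            i += 1
        else:
            a = "idle"
        if b == "DJOIN":
            joined = True  # the wait state is released from the next cycle on
        log.append((a, b))
    return log


log = simulate(["ADD", "WAIT", "SUB"], ["MUL", "MUL", "MUL", "DJOIN"])
first_column = [a for a, _ in log]
second_column = [b for _, b in log]
```

Here the first unit does not reach its instruction after WAIT until the cycle after the second unit executes DJOIN, which is the ordering the rejection relies on.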

33. Referring to claim 14, Fernando has taught a processor control apparatus as described in 
claim 10. Fernando has further taught that each of said series of instructions includes a VLIW 
type instruction. Note that Fernando has taught SIMD and MIMD instructions. Both are
forms of VLIW instructions.

34. Referring to claim 15, Fernando has taught a processor control apparatus as described in 
claim 10. Fernando has further taught that each of said series of instructions includes a series of
time-sharing instructions for serially driving a plurality of ones of said arithmetic units. For
instance, looking at Fig. 7 of Fernando, the series of instructions corresponding to thread 2 
execute from time 2 to time 7. Therefore, the thread takes 6 time units to execute and each 
individual instruction requires some portion of the overall 6 time units for execution. Therefore, 
each instruction is a time-sharing instruction, i.e., each instruction shares the overall 6 time units 
with the other instructions. 

35. Referring to claim 16, Fernando has taught a processor control apparatus as described in 
claim 10. Fernando has further taught power control elements for controlling power supply to 
said arithmetic units based on their instruction executing states. See the abstract and column 8, 
lines 10-12, and note that unused processing elements are deactivated in order to conserve 
power. 

36. Referring to claim 24, Fernando has taught a processor comprising: 

a) a plurality of arithmetic units. See Fig. 1, components 24 and 26. 

b) a single instruction memory for storing a plurality of series of instructions and supplying them 
to arithmetic units. See Fig. 1, component 12. Fernando has not taught a plurality of instruction
memories for storing a plurality of series of instructions to be executed by a plurality of 
arithmetic units. However, Official Notice is taken that having a plurality of memories for 
storing independent series of instructions is well known and accepted in the art. In addition, as 
shown in Nerwin v. Erlichman, 168 USPQ 177 (1969), to make separable is generally not given
patentable weight or would have been an obvious improvement. For instance, a person of 
ordinary skill in the art would have recognized that by implementing a plurality of memories 
instead of a single memory with concurrent access capabilities, as taught by Fernando (evident in
Fig. 7 where multiple streams are fetched at once), then the circuitry required to allow for
concurrent access would be eliminated, thereby reducing the complexity and cost of the system.
Consequently, it would have been obvious to replace Fernando's single instruction memory with
a plurality of instruction memories. 

c) an instruction decoder for decoding a series of instructions from said instruction memories, 
and outputting a decoded result to any of said plurality of arithmetic units. See Fig. 1, component 
22a and 22b. 

d) and a selector for selectively switching between a plurality of series of instructions from said 
instruction memories to be decoded by said instruction decoder, and supplying a series of 
instructions thus selected to said instruction decoder. See Fig. 1, component 21, and claim 44 of 
Fernando. Note that the MUX can either select a first series of instruction from bus 36 or a 
second series from bus 35. 

e) wherein said plurality of series of instructions are issued to said plurality of arithmetic units 
from one of the plurality of instruction memories and are issued respectively from said plurality 
of instruction memories to enable said plurality of arithmetic units to be simultaneously and 
independently driven. It should be noted that it is inherent that a plurality of series (groups) of 
instructions are issued to arithmetic units from one of the instruction memories. Clearly, all 
instructions originate in instruction memory. Furthermore, applicant has used an "or" clause 
which means the examiner only needs to show the teaching of one of the limitations separated by 
the "or" clause. 

f) the plurality of arithmetic units execute the single series of instructions concurrently upon the 
single series of instructions including different commands for each of the plurality of instruction 



Application/Control Number: 09/855,776 Page 23 

Art Unit: 2183 

control units. See Fig. 4 and note that when the processor is in MIMD mode, a first series of
instructions is fetched by unit 20a and sent to arithmetic units 26 in component 24a, and a 
completely different series of instructions is fetched by unit 20b and sent to arithmetic units 26 in 
component 24b. See column 3, lines 62-67. 

g) the plurality of arithmetic units execute the single series of instructions concurrently upon the 
single series of instructions including a same command for each of the plurality of instruction 
control units. See Fig. 3 and note that when the processor is in SIMD mode, a single series of
instructions is fetched by unit 20a and the series is sent to multiple arithmetic units (components 
26 in components 24a and 24b). With SIMD, the instructions are the same. See column 3, lines 
47-61, and column 11, line 60, to column 12, line 4. 

37. Claim 13 is rejected under 35 U.S.C. 103(a) as being unpatentable over Fernando, as 
applied above, in view of Parady, as applied above. 

38. Referring to claim 13, Fernando has taught a processor control apparatus as described in 
claim 10. 

a) Fernando has not explicitly taught an instruction queue for temporarily storing, at a stage prior 
to said selector, a series of instructions to be transmitted from a second one of said instruction 
memories different from a first one of said instruction memories which stores a series of 
instructions being executed by said first predetermined arithmetic unit. However, Parady has 
taught employing multiple instruction queues for holding different series of instructions before 
one of them is selected. See Fig. 3, components 102-108. A person of ordinary skill in the art
would have recognized that an instruction queue allows a system to hold a series of instructions
within the system. Instructions from the queue will be retrieved faster than if they had to be
retrieved from main memory since main memory is slower than an on-chip memory, such as
Parady's queues. Consequently, it would have been obvious to one of ordinary skill in the art at
the time of the invention to modify Fernando to include instruction queues on-chip to hold series
of instructions so that the instructions are retrieved faster than they would be from main memory.

b) and a determiner for determining, based on a series of instructions being executed, whether or
not the process being performed by said first predetermined arithmetic unit can be interrupted, 
said determiner operating to output, if the process can be interrupted, an interrupt signal for 
interrupting the issuance of the series of instructions to said first instruction memory which is a 
source of the series of instructions being executed, and generate a switching instruction to said 
selector to switch to a series of instructions from said instruction queue. Looking at Fig. 1, it
should be realized that if the first arithmetic unit 24b is executing a stream of instructions from 
bus 36 and a DFORK instruction is encountered by decoder 22a, then the MUX for the first 
arithmetic unit will be configured such that the execution of a first stream of instructions is 
interrupted in order to execute a second stream of instructions from bus 35. See column 5, lines 
17-25, and column 7, lines 17-29. So, in essence, Fernando has taught interrupting a first 
process in order to execute a second process. 

39. Claims 18-22 and 26-27 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Fernando in view of Mohamed, as applied above, in view of Dowling, as applied above. 

40. Referring to claim 18, Fernando has taught a processor control apparatus for controlling a 
plurality of arithmetic units (Fig. 1, components 24a and 24b), said processor control apparatus
comprising a plurality of instruction control units for instructing said arithmetic units to execute 
a series of instructions, wherein each of said instruction control units includes: 

a) an instruction memory for storing a plurality of series of instructions and supplying them to 
arithmetic units. See Fig. 1, components 12 and 26. Fernando has not taught that each
instruction control unit includes an instruction memory for storing a plurality of series of 
instructions, i.e. a plurality of instruction memories. However, Official Notice is taken that 
having a plurality of memories for storing series of instructions is well known and accepted in 
the art. In addition, as shown in Nerwin v. Erlichman, 168 USPQ 177 (1969), to make separable
is generally not given patentable weight or would have been an obvious improvement. For 
instance, a person of ordinary skill in the art would have recognized that by implementing a 
plurality of memories instead of a single memory with concurrent access capabilities, as taught 
by Fernando (evident in Fig. 7 where multiple streams are fetched at once), then the circuitry 
required to allow for concurrent access would be eliminated, thereby reducing the complexity 
and cost of the system. Consequently, it would have been obvious to replace Fernando's single
instruction memory with a plurality of instruction memories. 

b) an instruction decoder for decoding a series of instructions and supplying the decoded series 
of instructions to an associated one of said arithmetic units. See Fig. 1, components 22a and 22b 
(note that the control units may each comprise at least a decoder, fetch unit, and instruction 
memory). 

c) wherein some of said instruction control units each have an instruction control selector for 
selectively switching between a first series of instructions from a first instruction memory of one 
of said instruction control units for simultaneously driving the plurality of arithmetic units upon
the instructions including a same command for each of the plurality of instruction control units.
See Fig. 3 and note that when the processor is in SIMD mode, a single series of instructions is 
fetched by unit 20a and the series is sent to multiple arithmetic units (components 26 in 
components 24a and 24b). With SIMD, the instructions are the same. See column 3, lines 47- 
61, and column 11, line 60, to column 12, line 4.

d) a second series of instructions from a second instruction memory of another instruction 
control unit different from said one instruction control unit for independently driving each of the 
plurality of arithmetic units to output one of said first and second series of instructions thus 
selected to said instruction decoder. See Fig. 4 and note that when the processor is in MIMD
mode, a first series of instructions is fetched by unit 20a and sent to arithmetic units 26 in 
component 24a, and a completely different series of instructions is fetched by unit 20b and sent 
to arithmetic units 26 in component 24b. See column 3, lines 62-67. 

e) Fernando has not taught simultaneously driving the plurality of arithmetic units upon the 
instructions including different commands for each of the plurality of instruction control units. 
However, Mohamed has taught a system in which a fetcher/scheduler may issue either SIMD 
instructions (same commands) or VLIW instructions (different commands) to groups of 
arithmetic units. See column 3, lines 51-65, column 4, lines 39-42, and column 5, lines 9-17. A 
person of ordinary skill in the art would have recognized that since Fernando has taught a system 
with multiple functional units, Fernando is capable of executing at least SIMD and VLIW 
instructions. By modifying Fernando to also issue different commands (VLIW) to arithmetic 
units, the system is able to take advantage of horizontal programming while also minimizing 
hazards. See column 1, lines 21-25, and 40-41. As a result, it would have been obvious to one
of ordinary skill in the art at the time of the invention to modify Fernando to allow the fetcher
20a to issue VLIWs with different commands to arithmetic units 26, in addition to SIMDs with 
same commands. One would be motivated to make such a combination because hazards are 
reduced and horizontal programming is advantageous. In addition, it allows for more flexibility. 
That is, instead of fetcher 20a only being limited to issuing same commands to units 26, different 
commands would be issued as well. 

f) Fernando has further taught that each of said arithmetic units includes a register file (say 25b) 
for storing data during execution and the ability to access the register file of another arithmetic 
unit (25a) via bus 38 by using MOVE instructions. See column 6, lines 16-32. Fernando has not 
taught that each arithmetic unit includes a first register file and a second register file for storing 
data generated by said first and second series of instructions, respectively, which are supplied 
from said first and second instruction memories and decoded by an instruction decoder of an 
associated one of said instruction control units, and an arithmetic unit selector for selectively 
switching between said data generated by said first and second series of instructions being 
executed and stored in said first and second register files, respectively, according to an 
instruction from said associated instruction decoder to supply a selected one of said first and 
second series of instructions to a calculator. However, Dowling has taught such a concept.
Note from Fig. 2 that arithmetic unit 200 includes two register sets 160 and 260 for use in
executing two different instruction streams 130 and 230. Based on the stream, one of the register 
sets is selected by selector 290. By adding a second set of registers to the arithmetic unit, 
according to column 9, lines 59-62, the processing hardware can quickly switch both internal 
context and external context, i.e. quickly switch execution from one stream of instructions to
another, as Fernando has taught. A person of ordinary skill in the art would have also recognized
that by having a second register set in an arithmetic unit, the MOVE instruction would not be 
required to move data from one set to another in order to facilitate execution of another stream. 
This would also reduce the complexity of the system. Therefore, in order to speed up the system 
and reduce complexity, it would have been obvious to one of ordinary skill in the art at the time 
of the invention to modify Fernando such that each arithmetic unit includes a second register set. 
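
The proposed modification can be illustrated with a short sketch of an arithmetic unit that keeps one register set per instruction stream and selects between them. This is a hedged model only: the class `ArithmeticUnit`, its methods, and the keys `stream_130` and `stream_230` are hypothetical names chosen to echo Dowling's reference numerals, and they do not reproduce either reference's hardware.

```python
class ArithmeticUnit:
    """One arithmetic unit holding a separate register file per instruction stream."""

    def __init__(self):
        # Hypothetical stand-ins for register sets 160 and 260.
        self.register_files = {"stream_130": {}, "stream_230": {}}
        self.active = "stream_130"

    def select(self, stream):
        """Stands in for selector 290: choose the register set for a stream."""
        if stream not in self.register_files:
            raise KeyError(stream)
        self.active = stream

    def write(self, register, value):
        self.register_files[self.active][register] = value

    def read(self, register):
        return self.register_files[self.active][register]


unit = ArithmeticUnit()
unit.write("r1", 10)        # the first stream stores a result
unit.select("stream_230")
unit.write("r1", 99)        # the second stream reuses the same register name
unit.select("stream_130")
first_value = unit.read("r1")
unit.select("stream_230")
second_value = unit.read("r1")
```

Switching streams in this sketch changes only the selection, so no data is copied between register sets, which corresponds to the stated benefit of not requiring MOVE instructions to change streams.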

41. Referring to claim 19, Fernando in view of Mohamed and further in view of Dowling has
taught a processor control apparatus as described in claim 18. Fernando has further taught that
each of said series of instructions includes a VLIW type instruction. It should be noted that
Fernando and Mohamed teach VLIW, SIMD, and MIMD. All are considered a form of VLIW. 

42. Referring to claim 20, Fernando in view of Mohamed and further in view of Dowling has 
taught a processor control apparatus as described in claim 18. Fernando has further taught that 
each of said series of instructions includes a series of time-sharing instructions for serially 
driving a plurality of ones of said arithmetic units. For instance, looking at Fig. 7 of Fernando, 
the series of instructions corresponding to thread 2 execute from time 2 to time 7. Therefore, the 
thread takes 6 time units to execute and each individual instruction requires some portion of the 
overall 6 time units for execution. Therefore, each instruction is a time-sharing instruction, i.e., 
each instruction shares the overall 6 time units with the other instructions. 

43. Referring to claim 21, Fernando in view of Mohamed and further in view of Dowling has 
taught a processor control apparatus as described in claim 18. Fernando has further taught power 
control elements for controlling power supply to said arithmetic units based on their instruction
executing states. See the abstract and column 8, lines 10-12, and note that unused processing
elements are deactivated in order to conserve power. 

44. Referring to claim 22, Fernando has taught a processor control apparatus for controlling a 
plurality of arithmetic units (Fig. 1, components 24 and 26), said processor control apparatus 
comprising a plurality of instruction control units (Fig. 1) for instructing said arithmetic units to 
execute a series of instructions. 

a) wherein said instruction control units have a single instruction memory used in common for 
storing a plurality of series of instructions (see Fig. 1, component 12), and each comprises an
instruction decoder for decoding a series of instructions and supplying the decoded series of
instructions to an associated one of said arithmetic units (see Fig. 1, components 22a and 22b, and
note that the control units may each comprise at least a decoder, fetch unit, and instruction 
memory), and said single instruction memory has a plurality of ports for issuing said series of 
instructions to said respective instruction decoders (See Fig. 1 component 12 and note that 
multiple ports are connected to multiple fetch units and ultimately to multiple decoders). 

b) wherein some of said instruction control units each comprise an instruction control selector for 
selectively switching between a first series of instructions simultaneously driving the plurality of 
arithmetic units upon the instructions including a same command for each of the plurality of 
instruction control units. See Fig. 3 and note that when the processor is in SIMD mode, a single
series of instructions is fetched by unit 20a and the series is sent to multiple arithmetic units 
(components 26 in components 24a and 24b). With SIMD, the instructions are the same. See 
column 3, lines 47-61, and column 11, line 60, to column 12, line 4.

c) a second series of instructions independently driving each of the plurality of arithmetic units
both from said single instruction memory to output one of said first and second series of
instructions thus selected to said instruction decoder. See Fig. 4 and note that when the processor
is in MIMD mode, a first series of instructions is fetched by unit 20a and sent to arithmetic units 
26 in component 24a, and a completely different series of instructions is fetched by unit 20b and 
sent to arithmetic units 26 in component 24b. See column 3, lines 62-67. 

d) Fernando has not taught simultaneously driving the plurality of arithmetic units upon the 
instructions including different commands for each of the plurality of instruction control units. 
However, Mohamed has taught a system in which a fetcher/scheduler may issue either SIMD 
instructions (same commands) or VLIW instructions (different commands) to groups of 
arithmetic units. See column 3, lines 51-65, column 4, lines 39-42, and column 5, lines 9-17. A 
person of ordinary skill in the art would have recognized that since Fernando has taught a system 
with multiple functional units, Fernando is capable of executing at least SIMD and VLIW 
instructions. By modifying Fernando to also issue different commands (VLIW) to arithmetic 
units, the system is able to take advantage of horizontal programming while also minimizing 
hazards. See column 1, lines 21-25, and 40-41. As a result, it would have been obvious to one 
of ordinary skill in the art at the time of the invention to modify Fernando to allow the fetcher 
20a to issue VLIWs with different commands to arithmetic units 26, in addition to SIMDs with 
same commands. One would be motivated to make such a combination because hazards are 
reduced and horizontal programming is advantageous. In addition, it allows for more flexibility. 
That is, instead of fetcher 20a only being limited to issuing same commands to units 26, different 
commands would be issued as well.

e) Fernando has further taught wherein each of said arithmetic units includes a register file (say
25b) for storing data during execution and the ability to access the register file of another 
arithmetic unit (25a) via bus 38 by using MOVE instructions. See column 6, lines 16-32. 
Fernando has not taught that each arithmetic unit includes a first register file and a second 
register file for storing data generated by said first and second series of instructions, respectively, 
which are supplied from said first and second instruction memories and decoded by an 
instruction decoder of an associated one of said instruction control units, and an arithmetic unit 
selector for selectively switching between said data generated by said first and second series of 
instructions being executed and stored in said first and second register files, respectively, 
according to an instruction from said associated instruction decoder to supply a selected one of 
said first and second series of instructions to a calculator. However, Dowling has taught such a
concept. Note from Fig. 2 that arithmetic unit 200 includes two register sets 160 and 260 for use
in executing two different instruction streams 130 and 230. Based on the stream, one of the 
register sets is selected by selector 290. By adding a second set of registers to the arithmetic 
unit, according to column 9, lines 59-62, the processing hardware can quickly switch both 
internal context and external context, i.e. quickly switch execution from one stream of 
instructions to another, as Fernando has taught. A person of ordinary skill in the art would have 
also recognized that by having a second register set in an arithmetic unit, the MOVE instruction 
would not be required to move data from one set to another in order to facilitate execution of 
another stream. This would also reduce the complexity of the system. Therefore, in order to 
speed up the system and reduce complexity, it would have been obvious to one of ordinary skill
in the art at the time of the invention to modify Fernando such that each arithmetic unit includes
a second register set. 

45. Referring to claim 26, Fernando has taught a processor comprising: 

a) a plurality of arithmetic units (Fig. 1, components 24 and 26).

b) a plurality of instruction control units (Fig. 1) for driving said arithmetic units in a controlled 
manner, wherein each of said instruction control units includes: 

c) a single instruction memory for storing a plurality of series of instructions and supplying them 
to arithmetic units. See Fig. 1, components 12 and 26. Fernando has not taught that each 
instruction control unit includes an instruction memory for storing a plurality of series of 
instructions, i.e. a plurality of instruction memories. However, Official Notice is taken that 
having a plurality of memories for storing series of instructions is well known and accepted in 
the art. In addition, as shown in Nerwin v. Erlichman, 168 USPQ 177 (1969), to make separable
is generally not given patentable weight or would have been an obvious improvement. For 
instance, a person of ordinary skill in the art would have recognized that by implementing a 
plurality of memories instead of a single memory with concurrent access capabilities, as taught 
by Fernando (evident in Fig. 7 where multiple streams are fetched at once), then the circuitry 
required to allow for concurrent access would be eliminated, thereby reducing the complexity 
and cost of the system. Consequently, it would have been obvious to replace Fernando's single 
instruction memory with a plurality of instruction memories. 

d) an instruction decoder for decoding a series of instructions and supplying the decoded series 
of instructions to an associated one of said arithmetic units. See Fig. 1, components 22a and 22b
(note that the control units may each comprise at least a decoder, fetch unit, and instruction
memory). 

e) wherein some of said instruction control units each have an instruction control selector for 
selectively switching between a first series of instructions from a first instruction memory of one 
of said instruction control units simultaneously driving the plurality of arithmetic units and a 
second series of instructions from a second instruction memory of another instruction control 
unit different from said one instruction control unit independently driving each of the plurality of 
arithmetic units to output one of said first and second series of instructions thus selected to said 
instruction decoder, based on contents of processes to be executed. See Fig. 1, component 21 and 
note that either a first series of instructions along bus 36 may be selected for decoding or a 
second series of instructions from bus 35 may be selected for decoding. When the first series of 
instructions along bus 36 is selected, the system is in SIMD mode (Fig. 3), where the plurality of 
arithmetic units (components 26 in component 24b) are driven at the same time as a second 
group of arithmetic units (components 26 in component 24a) by the same stream of instructions. 
When the second series of instructions along bus 35 is selected, the system is in MIMD mode
(Fig. 4), where each of the plurality of arithmetic units (components 26 in component 24b) is
driven independently of a second group of arithmetic units (components 26 in component 24a) 
by different streams of instructions. Also, from Fig. 5, it should be noted that the switching 
between modes is done based on instructions (contents) of the processes. For instance, when a 
cfork instruction is encountered, SIMD mode is entered where the arithmetic units are 
simultaneously driven by a single stream whereas when a dfork is encountered, the arithmetic 
units are independently driven by multiple instruction streams.

f) wherein each of said arithmetic units includes a register file (say 25b) for storing data during
execution and the ability to access the register file of another arithmetic unit (25a) via bus 38 by 
using MOVE instructions. See column 6, lines 16-32. Fernando has not taught that each 
arithmetic unit includes a first register file and a second register file for storing data generated by 
said first and second series of instructions, respectively, which are supplied from said first and 
second instruction memories and decoded by an instruction decoder of an associated one of said 
instruction control units, and an arithmetic unit selector for selectively switching between said 
data generated by said first and second series of instructions being executed and stored in said 
first and second register files, respectively, according to an instruction from said associated 
instruction decoder to supply a selected one of said first and second series of instructions to a 
calculator. However, Dowling has taught such a concept. Note from Fig. 2, that arithmetic unit 
200 includes two register sets 160 and 260 for use in executing two different instruction streams 
130 and 230. Based on the stream, one of the register sets is selected by selector 290. By adding 
a second set of registers to the arithmetic unit, according to column 9, lines 59-62, the processing 
hardware can quickly switch both internal context and external context, i.e. quickly switch 
execution from one stream of instructions to another, as Fernando has taught. A person of 
ordinary skill in the art would have also recognized that by having a second register set in an 
arithmetic unit, the MOVE instruction would not be required to move data from one set to 
another in order to facilitate execution of another stream. This would also reduce the complexity 
of the system. Therefore, in order to speed up the system and reduce complexity, it would have 
been obvious to one of ordinary skill in the art at the time of the invention to modify Fernando 
such that each arithmetic unit includes a second register set. 
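
For illustration only, the proposed modification can be sketched as an arithmetic unit with two register sets and a selector. This is a minimal model under stated assumptions, not code from Fernando or Dowling: the class `ArithmeticUnit`, its methods, and the register count are hypothetical.

```python
class ArithmeticUnit:
    """Hypothetical arithmetic unit with one register set per instruction stream."""

    def __init__(self, num_regs=8):
        # Two independent register sets, one for each of two streams.
        self.register_sets = [[0] * num_regs, [0] * num_regs]
        self.active = 0  # index of the currently selected register set

    def select(self, stream_id):
        """Model of the selector: switch register sets without moving any data."""
        if stream_id not in (0, 1):
            raise ValueError("stream_id must be 0 or 1")
        self.active = stream_id

    def write(self, reg, value):
        self.register_sets[self.active][reg] = value

    def read(self, reg):
        return self.register_sets[self.active][reg]
```

In this sketch, switching streams is a single change of index, so the state of the suspended stream stays in place and no MOVE-style copy between register sets is needed.
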
g) the plurality of arithmetic units execute the single series of instructions concurrently upon the 
single series of instructions including different commands for each of the plurality of instruction 
control units. See Fig. 4 and note that when the processor is in MIMD mode, a first series of 
instructions is fetched by unit 20a and sent to arithmetic units 26 in component 24a, and a 
completely different series of instructions is fetched by unit 20b and sent to arithmetic units 26 in 
component 24b. See column 3, lines 62-67. 

h) the plurality of arithmetic units execute the single series of instructions concurrently upon the 
single series of instructions including a same command for each of the plurality of instruction 
control units. See Fig. 3 and note that when the processor is in SIMD mode, a single series of 
instructions is fetched by unit 20a and the series is sent to multiple arithmetic units (components 
26 in components 24a and 24b). With SIMD, the instructions are the same. See column 3, lines 
47-61, and column 11, line 60, to column 12, line 4. 

46. Referring to claim 27, Fernando has taught a processor comprising: 

a) a plurality of arithmetic units (Fig. 1, components 24 and 26). 

b) a plurality of instruction control units (Fig. 1) for driving said arithmetic units in a controlled 
manner. 

c) wherein said instruction control units have a single instruction memory used in common for 
storing a plurality of series of instructions (see Fig. 1, component 12), and each includes an 
instruction decoder for decoding a series of instructions and supplying the decoded series of 
instructions to an associated one of said arithmetic units (see Fig. 1, components 22a and 22b, and 
note that the control units may each comprise at least a decoder, fetch unit, and instruction 
memory), and said single instruction memory has a plurality of ports for issuing said series of 
instructions to said respective instruction decoders (See Fig. 1 component 12 and note that 
multiple ports are connected to multiple fetch units and ultimately to multiple decoders). 

d) wherein some of said instruction control units each have an instruction control selector for 
selectively switching between a first series of instructions simultaneously driving the plurality of 
arithmetic units upon the instructions including a same command for each of the plurality of 
instruction control units. See Fig. 3 and note that when the processor is in SIMD mode, a single 
series of instructions is fetched by unit 20a and the series is sent to multiple arithmetic units 
(components 26 in components 24a and 24b). With SIMD, the instructions are the same. See 
column 3, lines 47-61, and column 11, line 60, to column 12, line 4. 

d) a second series of instructions independently driving each of the plurality of arithmetic units 
both from said single instruction memory to output one of said first and second series of 
instructions thus selected to said instruction decoder, based on contents of processes to be 
executed. See Fig. 4 and note that when the processor is in MIMD mode, a first series of 
instructions is fetched by unit 20a and sent to arithmetic units 26 in component 24a, and a 
completely different series of instructions is fetched by unit 20b and sent to arithmetic units 26 in 
component 24b. See column 3, lines 62-67. 

e) Fernando has not taught simultaneously driving the plurality of arithmetic units upon the 
instructions including different commands for each of the plurality of instruction control units. 
However, Mohamed has taught a system in which a fetcher/scheduler may issue either SIMD 
instructions (same commands) or VLIW instructions (different commands) to groups of 
arithmetic units. See column 3, lines 51-65, column 4, lines 39-42, and column 5, lines 9-17. A 
person of ordinary skill in the art would have recognized that since Fernando has taught a system 
with multiple functional units, Fernando is capable of executing at least SIMD and VLIW 
instructions. By modifying Fernando to also issue different commands (VLIW) to arithmetic 
units, the system is able to take advantage of horizontal programming while also minimizing 
hazards. See column 1, lines 21-25, and 40-41. As a result, it would have been obvious to one 
of ordinary skill in the art at the time of the invention to modify Fernando to allow the fetcher 
20a to issue VLIWs with different commands to arithmetic units 26, in addition to SIMDs with 
same commands. One would be motivated to make such a combination because hazards are 
reduced and horizontal programming is advantageous. In addition, it allows for more flexibility. 
That is, instead of fetcher 20a only being limited to issuing same commands to units 26, different 
commands would be issued as well. 
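
For illustration only, the difference between issuing same commands (SIMD) and different commands (VLIW) to a group of arithmetic units can be sketched as follows. This is a minimal model, not code from Fernando or Mohamed: the function `issue` and the tuple encoding of an instruction word are hypothetical.

```python
def issue(word, num_units):
    """Return the command delivered to each arithmetic unit for one issued word.

    `word` is a hypothetical (kind, commands) pair:
      ("SIMD", [cmd])            one command broadcast to every unit
      ("VLIW", [cmd0, cmd1, ...]) one distinct command slot per unit
    """
    kind, commands = word
    if kind == "SIMD":
        if len(commands) != 1:
            raise ValueError("a SIMD word carries exactly one command")
        return commands * num_units  # same command for every unit
    if kind == "VLIW":
        if len(commands) != num_units:
            raise ValueError("a VLIW word needs one command per unit")
        return list(commands)  # different command for each unit
    raise ValueError("unknown instruction kind")
```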

f) wherein each of said arithmetic units includes a register file (say 25b) for storing data during 
execution and the ability to access the register file of another arithmetic unit (25a) via bus 38 by 
using MOVE instructions. See column 6, lines 16-32. Fernando has not taught that each 
arithmetic unit includes a first register file and a second register file for storing data generated by 
said first and second series of instructions, respectively, which are supplied from said first and 
second instruction memories and decoded by an instruction decoder of an associated one of said 
instruction control units, and an arithmetic unit selector for selectively switching between said 
data generated by said first and second series of instructions being executed and stored in said 
first and second register files, respectively, according to an instruction from said associated 
instruction decoder to supply a selected one of said first and second series of instructions to a 
calculator. However, Dowling has taught such a concept. Note from Fig. 2, that arithmetic unit 
200 includes two register sets 160 and 260 for use in executing two different instruction streams 
130 and 230. Based on the stream, one of the register sets is selected by selector 290. By adding 
a second set of registers to the arithmetic unit, according to column 9, lines 59-62, the processing 
hardware can quickly switch both internal context and external context, i.e. quickly switch 
execution from one stream of instructions to another, as Fernando has taught. A person of 
ordinary skill in the art would have also recognized that by having a second register set in an 
arithmetic unit, the MOVE instruction would not be required to move data from one set to 
another in order to facilitate execution of another stream. This would also reduce the complexity 
of the system. Therefore, in order to speed up the system and reduce complexity, it would have 
been obvious to one of ordinary skill in the art at the time of the invention to modify Fernando 
such that each arithmetic unit includes a second register set. 

Conclusion 

47. Applicant's amendment necessitated the new ground(s) of rejection presented in this 
Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). 
Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, 
however, will the statutory period for reply expire later than SIX MONTHS from the date of this 
final action. 

48. The prior art made of record and not relied upon is considered pertinent to applicant's 
disclosure. Applicant is reminded that in amending in response to a rejection of claims, the 
patentable novelty must be clearly shown in view of the state of the art disclosed by the 
references cited and the objections made. Applicant must also show how the amendments avoid 
such references and objections. See 37 CFR § 1.111(c). 

Kogge, U.S. Patent No. 5,475,856, has taught a dynamic multi-mode parallel processing 
array in which SIMD and MIMD modes are implemented. Multiple control units are coupled to 
a single memory for receiving a single stream of instructions and also to independent memories 
for receiving independent streams of instructions. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to David J. Huisman whose telephone number is (571) 272-4168. 
The examiner can normally be reached on Monday-Friday (8:00-4:30). 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Eddie Chan, can be reached on (571) 272-4162. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 



DJH 

David J. Huisman 
November 2, 2005 




